Unsupervised Clustering of Comments Written in Albanian Language

Abstract

Nowadays, social media and the communications that take place on them have become very important for service providers, as they play a key role in service quality improvement as well as in decision making. Consumers' discussions are usually written in their local languages, and extracting knowledge from them is sometimes hard and problematic. Natural language processing (NLP) techniques are helpful in this field, but different languages have different specifics and difficulties, and some languages are not well served by NLP methods, Albanian in particular. In this paper, we try to address this problem for the Albanian language as spoken in Kosovo. Namely, a dataset of comments written in Albanian as spoken in Kosovo (the local variety), collected from social media, is clustered with unsupervised clustering techniques according to the topic of discussion of each comment. In this research, text feature extraction techniques (vectorization and others) and unsupervised clustering algorithms (K-means, Spectral, Agglomerative, etc.) are used with the idea of finding and defining the more appropriate ones. The paper presents the results of the conducted experiments and discusses what is more appropriate in the case of Albanian and of other similar languages or groups of languages (those which are weak in NLP).
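The pipeline named in the abstract (text vectorization followed by unsupervised clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the four comments are invented toy data written without diacritics, the smoothed TF-IDF weighting and the farthest-point initialisation are assumptions made for this example, and only K-means is shown out of the algorithms the abstract lists. The function names `tfidf_vectors`, `sq_dist` and `kmeans` are hypothetical helpers.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for whitespace-tokenised docs."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    n_docs = len(docs)
    # Document frequency: in how many comments each token appears.
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    vectors = []
    for toks in tokenised:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            # Smoothed IDF (an assumption of this sketch), so tokens that
            # occur in every comment keep a small non-zero weight.
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
            vec[index[tok]] = count * idf
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented toy comments (not from the paper's dataset): two about
# internet speed, two about prices.
comments = [
    "interneti eshte shume i ngadalte",
    "interneti eshte i ngadalte sot",
    "cmimi i pakos eshte i larte",
    "cmimi eshte shume i larte",
]
vocab, vectors = tfidf_vectors(comments)
labels = kmeans(vectors, k=2)
```

On this toy input the two internet-speed comments share one label and the two price comments share the other. With real comments, tokenisation, stop-word handling and the choice of the number of clusters would all need more care than this sketch gives them.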

Similar resources

Unsupervised Clustering for Language Identification

The current state of the art in language identification comes from n-gram language models. While these can reach 99% accuracy (Hammarstrom, 2007), they have three major shortcomings. First, n-gram language models are supervised. They require substantial labeled training data in each language in order to be functional. For best results, this training data should also be in the same genre as the ...

Iranian EFL Learners’ Reactions to Different Feedbacks in Writing Classrooms: Teacher Written Comments (TWC) vs. Peer Written Comments (PWC)

The teaching of writing has recently begun to move away from a concentration on the written product to an emphasis on the process of writing. Feedback is a fundamental element of the process approach to writing. It can be defined as input from a reader to a writer with the effect of providing information to the writer for a revision. This study reports on the effectiveness of two types of feedb...

Application of Clustering for Unsupervised Language Learning

We describe a method for automatically learning word similarity from a corpus. We constructed feature vectors for words according to their appearance in different dependency paths in parse trees of corpus sentences. Clustering the huge amount of raw data costs too much time and memory, so we devised techniques to make the problem tractable. We used PCA to reduce the dimensionality of the featur...

Interpersonal function of language in subtitling

Translation as a communicative process is always said to be associated with various aspects of meaning loss or gain. Subtitling as a mode of translating, due to special discoursal and textual conditions imposed upon it, is believed to be an obvious case of this loss or gain. Presenting the spoken sound track of a film in writing and synchronizing the perception of this text by the viewers with...

Journal

Journal title: International Journal of Advanced Computer Science and Applications

Year: 2021

ISSN: 2158-107X, 2156-5570

DOI: https://doi.org/10.14569/ijacsa.2021.0120833